AITopics | larger class

Update: Thank you for the feedback, I have read it as well as other reviews. Compared with the vast literature on obtaining upper bounds on convergence rates of stochastic convex optimization problems, less work has been done towards deriving corresponding lower bounds that depict the fundamental hardness of these problems. This paper aims to fill this gap and proposes a general framework for comparing upper and lower bounds. The framework also suggests potential future research directions for obtaining better convergence rates. In addition, this paper proved, for all round t, a lower bound on the expected convergence rate of SGD over any diminishing step-size sequences when applied to strongly convex problems. This bound shows that the step-size schemes proposed in recent work (Gower et al. 2019, and Nguyen et al. 2018) are optimal within a dimension independent constant factor.

algorithm, diminishing step size, tight dimension independent lower bound, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.37)

Add feedback

Online Gradient Boosting

Neural Information Processing SystemsOct-11-2024, 09:11:21 GMT

We extend the theory of boosting for regression problems to the online learning setting. Generalizing from the batch setting for boosting, the notion of a weak learning algorithm is modeled as an online learning algorithm with linear loss functions that competes with a base class of regression functions, while a strong learning algorithm is an online learning algorithm with smooth convex loss functions that competes with a larger class of regression functions. Our main result is an online gradient boosting algorithm which converts a weak online learning algorithm into a strong one where the larger class of functions is the linear span of the base class. We also give a simpler boosting algorithm that converts a weak online learning algorithm into a strong one where the larger class of functions is the convex hull of the base class, and prove its optimality.

algorithm, larger class, online, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.66)

Add feedback

Online Gradient Boosting

Beygelzimer, Alina, Hazan, Elad, Kale, Satyen, Luo, Haipeng

Neural Information Processing SystemsFeb-14-2020, 11:43:45 GMT

We extend the theory of boosting for regression problems to the online learning setting. Generalizing from the batch setting for boosting, the notion of a weak learning algorithm is modeled as an online learning algorithm with linear loss functions that competes with a base class of regression functions, while a strong learning algorithm is an online learning algorithm with smooth convex loss functions that competes with a larger class of regression functions. Our main result is an online gradient boosting algorithm which converts a weak online learning algorithm into a strong one where the larger class of functions is the linear span of the base class. We also give a simpler boosting algorithm that converts a weak online learning algorithm into a strong one where the larger class of functions is the convex hull of the base class, and prove its optimality. Papers published at the Neural Information Processing Systems Conference.

algorithm, larger class, online, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.66)

Add feedback

Filters

Collaborating Authors

larger class

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD

e6acf4b0f69f6f6e60e9a815938aa1ff-AuthorFeedback.pdf

Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD

Reviews: Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD

Online Gradient Boosting

Online Gradient Boosting